The Effects of a Production Level “Voice-Command” Interface on Driver Behavior: Summary Findings on Reported Workload, Physiology, Visual Attention, and Driving Performance

Authors

  • Bryan Reimer
  • Bruce Mehler
Abstract

This report summarizes key results of an on-road study assessing perceived workload, physiological arousal, visual attention, and basic driving performance metrics while drivers engaged in a number of tasks with a production-version, in-vehicle voice-command system. The same metrics were also evaluated while participants carried out an implementation of the manual radio tuning reference task (Driver Focus-Telematics Working Group, 2006) and three levels of an audio-presentation / verbal-response delayed digit recall task (n-back) that is known to produce graded levels of cognitive demand. Extensive training on all tasks was provided prior to assessment under highway driving conditions. Results for an analysis sample of 60 drivers, equally distributed across both genders and two age groups (20-29 and 60-69), are summarized here and presented in detail in an associated technical report (Reimer, Mehler, Dobres, & Coughlin, 2013). Depending on the task assessed and the measure evaluated, both positive features and concerns associated with the use of the voice interface were identified. Physiological arousal during the voice tasks was comparable to or lower than that observed during the more difficult level of the manual radio tuning task, as measured by skin conductance and heart rate, respectively. Perhaps most notable was the identification of a high level of visual demand / engagement during selected tasks, such as the use of the voice-command interface for entering addresses into the navigation system. It also appeared that different age / gender groupings tended to interact with the voice system in different ways. These findings highlight that implementations of voice interfaces can be highly multi-modal and are not necessarily free of visual-manual demands on attentional resources.
If one were to apply the current National Highway Traffic Safety Administration (NHTSA) visual-manual distraction guidelines to the tasks assessed, a number of “voice” interactions would not meet the total off-road glance time criteria of the guidelines. While these data were not collected in full alignment with NHTSA’s simulation-based guidelines, the overall structure and metrics are similar, and so this work raises a number of important questions. It is clear that visual demand needs to be considered in the design of multi-modal voice interfaces. This highlights the question of how an acceptable level of visual demand should be defined in the context of the multi-step and extended task time interactions that characterize activities involving voice-command interfaces. Finally, the results illustrate the necessity for additional research assessing the generalizability of these findings to other production level and hand-held “voice” interactions, and for developing methods of quantitatively assessing the net attentional costs and benefits of providing drivers with information across different modalities. Voice interactions can play an important role in the vehicle environment. Optimizing the selection of activities in which the driver utilizes voice interaction and the appropriate design of displays will help to maximize driver attentional focus towards information necessary for vehicle operation, while allowing, where appropriate, interactions with interfaces for comfort, convenience and communication functions.

White Paper 2013-18A ©MIT AgeLab 2013

Introduction

Voice-command interfaces have been proposed, and in some cases aggressively advertised, as a means to allow drivers to engage with an expanding array of entertainment and connectivity options in the modern automobile while keeping their eyes on the road and hands on the steering wheel.
This is an intuitively appealing concept, and a reasonably respectable body of computer science, psychology and human factors based laboratory, simulator, and test track studies has identified situations in which primarily experimentally created voice interfaces have shown distinct advantages over visual-manual interfaces in terms of primary task performance (driving or driving-like tasks) and glance behavior (see Barón & Green, 2006; Lo & Green, 2013; and Reimer, Mehler, Dobres & Coughlin, 2013 for reviews). However, it is not clear how directly the interactions observed with these types of experimental, hand-held, or aftermarket voice interfaces generalize to production level automotive systems (systems integrated into the vehicle directly by the manufacturer). Assessments of production level systems have been far fewer in number and generally examine a limited set of task characteristics with modestly sized samples (Carter & Graham, 2000; Chiang et al., 2005; Harbluk, Burns, Lochner, & Trbovich, 2007; Owens, McLaughlin, & Sudweeks, 2010; Shutko et al., 2009; Shutko & Tijerina, 2011). Furthermore, only Chiang et al. (2005) and Owens, McLaughlin, & Sudweeks (2010) assessed driver behavior with production systems under actual field driving conditions. While the findings reported in the latter studies have generally presented voice interfaces in a positive light, there is some evidence that voice-based interfaces may not always be completely free of visual-manual demand (see the review in Reimer, Mehler, Dobres, et al., 2013). In addition, some questions have been raised about the extent to which “eyes on the road” necessarily equates to “mind on the road”. In other words, to what extent might interaction with a voice interface, or audio content from e-mail or a phone conversation, result in cognitive demands or absorption that might produce another critical form of distraction, ultimately resulting in a loss of situational awareness?
The study summarized in this report was conceived and implemented with the goal of developing a comprehensive assessment of a production-level voice-command interface and the demands such a system places on drivers under real-world highway driving conditions. Metrics included visual behavior, physiological arousal as a measure of cognitive demand, driving performance measures, and self-reported workload in younger (20-29 years) and relatively older (60-69 years) samples of drivers broadly representative of the general driving population. Voice control of the radio, music selection from a connected MP3 device, and voice dialing of a stored phone number were selected as basic entertainment and communication tasks. Voice entry of a full street address into a navigation system was of particular interest, since manual entry of addresses into navigation devices while underway is generally recognized as being highly visual-manually demanding. Some OEMs have chosen to lock out manual entry of addresses while the vehicle is underway, while others allow it. Objectively evaluating the extent to which a voice entry implementation makes this task acceptable under driving conditions is thus quite relevant. Implementations of easy and hard levels of a radio tuning task were developed to support a “side-by-side” comparison of identical tasks using the visual-manual interface and voice interface for the same functional activity in the same vehicle. In addition, three levels of a delayed digit recall task (audio presentation of stimuli with a verbal response from the driver) were included. This task, known as the “n-back”, has been used extensively in research by our group (Mehler & Reimer, 2013; Mehler, Reimer, & Coughlin, 2012; Reimer & Mehler, 2011; Reimer, Mehler, Wang, & Coughlin, 2012) and is known to produce graded levels of cognitive demand as reflected in various physiological measures and self-reported levels of workload.
It was anticipated that the multiple cognitive demand levels represented by the n-back task could be used as a “ruler” against which various responses to the other tasks might be compared. Ranney and colleagues (Ranney et al., 2011) suggested in their exploratory work with the measure that the 2-back condition (the hardest level assessed) could “serve as a starting point for setting a limit for acceptable ‘dose’ of cognitive distraction” (p. 52).
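
The logic of the delayed digit recall task can be illustrated with a minimal sketch. It assumes the common form of the task, in which the driver hears one digit at a time and says aloud the digit presented n items earlier (0-back: the current digit; 1-back: the previous digit; 2-back: the digit before that). The function names and the scoring rule are hypothetical illustrations and are not taken from the study's materials.

```python
from typing import List, Optional, Sequence


def expected_responses(digits: Sequence[int], n: int) -> List[Optional[int]]:
    """Return the correct verbal response for each presented digit.

    For the first n digits there is nothing to recall yet, so the
    expected response is None (assumption: no response is scored there).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return [digits[i - n] if i >= n else None for i in range(len(digits))]


def accuracy(responses: Sequence[Optional[int]],
             digits: Sequence[int], n: int) -> float:
    """Fraction of scorable positions answered correctly (hypothetical rule)."""
    expected = expected_responses(digits, n)
    scored = [(r, e) for r, e in zip(responses, expected) if e is not None]
    if not scored:
        return 0.0
    return sum(r == e for r, e in scored) / len(scored)


if __name__ == "__main__":
    stimuli = [3, 7, 1, 9]
    for level in (0, 1, 2):
        print(level, expected_responses(stimuli, level))
```

Raising n increases the number of digits that must be held and updated in memory at once, which is the sense in which the three levels provide graded cognitive demand.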


Similar resources

Multi-modal assessment of on-road demand of voice and manual phone calling and voice navigation entry across two embedded vehicle systems

One purpose of integrating voice interfaces into embedded vehicle systems is to reduce drivers' visual and manual distractions with 'infotainment' technologies. However, there is scant research on actual benefits in production vehicles or how different interface designs affect attentional demands. Driving performance, visual engagement, and indices of workload (heart rate, skin conductance, sub...


A Preliminary Assessment of Perceived and Objectively Scaled Workload of a Voice-based Driver Interface

Interaction with a voice-command interface for radio control, destination entry, MP3 song selection, and phone dialing was assessed along with traditional manual radio control and a multi-level audio–verbal calibration task (n-back) on-road in 60 drivers. Subjective workload, compensatory behavior, and physiological indices of cognitive workload suggest that there may be both potential benefits ...


Multi-modal demands of a smartphone used to place calls and enter addresses during highway driving relative to two embedded systems

There is limited research on trade-offs in demand between manual and voice interfaces of embedded and portable technologies. Mehler et al. identified differences in driving performance, visual engagement and workload between two contrasting embedded vehicle system designs (Chevrolet MyLink and Volvo Sensus). The current study extends this work by comparing these embedded systems with a smartpho...


Survey of gender effect on driving performance and mental workload of Young Drivers using a driving simulator

Background and aims: Road traffic accidents annually lead to the death of 1.2 million people and the disability of some 50 million people worldwide. Iran is one of the countries with the highest rates of road accidents in the world. According to the annual statistics of the Iranian Legal Medicine Organization, 15,932 people lost their lives in road traffic accidents in 1395 SH. Acco...


Modeling of Stimulus-response Secondary Tasks with Different Modalities While Driving in a Computational Cognitive Architecture

This paper introduces a computational human performance model based upon the queueing network cognitive architecture to predict drivers' eye glances and workload for four stimulus-response secondary tasks (i.e., auditory-manual, auditory-speech, visual-manual, and visual-speech types) while driving. The model was evaluated with the empirical data from 24 subjects, and the percentage of eyes-off-...


Publication date: 2013